Which statistics reflect semantics? Rethinking synonymy and word similarity

Author

  • Derrick Higgins
Abstract

A great deal of work has been done of late on the statistical modeling of word similarity relations (cf. Schütze (1992), Lund and Burgess (1996), Landauer and Dumais (1997), Lin (1998), Turney (2001)). While this has largely been viewed as an engineering task (with the notable exception of much writing on Latent Semantic Analysis (LSA)), the relative success of different approaches to constructing word similarity measures is highly relevant to issues in theoretical semantics and language acquisition. With this background in mind, this paper has two main aims. First, we will present yet another statistical approach to the calculation of word-similarity scores (LC-IR), which significantly outperforms other methods on standard benchmarks including the 80-question set of TOEFL® synonym test items first employed by Landauer and Dumais (1997). Second, we hope to demonstrate that


Similar resources

Measuring the Degree of Synonymy between Words Using Relational Similarity between Word Pairs as a Proxy

Two types of similarities between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exists between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarities. Specifically,...


Behavioral profiles: A corpus-based perspective on synonymy and antonymy*

1 Introduction. 1.1 Two empirical perspectives in the study of synonymy and antonymy. The domain of linguistics that has arguably been studied most from a corpus-linguistic perspective is lexical, or even lexicographical, semantics. Already the early work of pioneers such as Firth and Sinclair has paved the way for the study of lexical items, their distribution, and what their distribution reveal...


Semantics of haq in the Glorious Quran

Meaning plays a very important role at all levels of linguistic analysis and in linguistics. We can say that the word itself and out of the chain of speech doesn’t show the true meaning. It should be in relation with other signs within the language that its meaning be relived. Quran, the precious word of Allah, contains words that take a variety of meanings in the syntactic and topical con...


Descriptive Semantics of the Nominal Hapax Legomenon of the Word Menhaj and the Pathology of its Three Translations (Meybodi, Makarem Shirazi and Ansarian)

Understanding the Quran depends upon appreciating meanings of the single words and concepts that are interconnected and interrelated like a chain. Nominal hapax legomenon in the Quran is a word that occurs only once in the holy Quran. Hence, such words need semantic scrutiny since they are difficult to understand. Accordingly, understanding hapax legomenons calls for examining and identifying t...


Computing Semantic Relatedness in German with Revised Information Content Metrics

The paper presents an application of information content based metrics to compute semantic relatedness of word senses in German. The main contributions are: an annotation study based on a revised definition of semantic relatedness beyond synonymy, an extension of Resnik’s (1995) procedure for computing information content of concepts for strongly inflected languages, an application of informati...




Publication date: 2004